OcrV1, Main, Exploration, bibRecord, 000B17

Geometric rectification of camera-captured document images.

Internal identifier: 000B17 (Main/Exploration); previous: 000B16; next: 000B18

Authors: Jian Liang [United States]; Daniel Dementhon; David Doermann

Source:

IEEE Transactions on Pattern Analysis and Machine Intelligence [ISSN 0162-8828]; 2008.

RBID: pubmed:18276966

English descriptors

KwdEn:
- Algorithms; Artifacts; Artificial Intelligence; Automatic Data Processing (methods); Documentation (methods); Image Enhancement (methods); Image Interpretation, Computer-Assisted (methods); Imaging, Three-Dimensional (methods); Pattern Recognition, Automated (methods); Photography (methods); Reproducibility of Results; Sensitivity and Specificity.
MESH:
- methods: Automatic Data Processing; Documentation; Image Enhancement; Image Interpretation, Computer-Assisted; Imaging, Three-Dimensional; Pattern Recognition, Automated; Photography.
- Algorithms; Artifacts; Artificial Intelligence; Reproducibility of Results; Sensitivity and Specificity.

Abstract

Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and non-contact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by non-planar document shape and perspective projection, which lead to failure of current OCR technologies. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image. Our approach estimates 3D document shape from texture flow information obtained directly from the image without requiring additional 3D/metric data or prior camera calibration. Our framework provides a unified solution for both planar and curved documents and can be applied in many, especially mobile, camera-based document analysis applications. Experiments show that our method produces results that are significantly more OCR compatible than the original images.

DOI: 10.1109/TPAMI.2007.70724
PubMed: 18276966

Affiliations:

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Geometric rectification of camera-captured document images.</title>
<author><name sortKey="Liang, Jian" sort="Liang, Jian" uniqKey="Liang J" first="Jian" last="Liang">Jian Liang</name>
<affiliation wicri:level="2"><nlm:affiliation>Amazon.com, 701 5th Avenue #614.B, Seattle, WA 98104, USA. jliang@amazon.com</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Amazon.com, 701 5th Avenue #614.B, Seattle, WA 98104</wicri:regionArea>
<placeName><region type="state">Washington (État)</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
</author>
<author><name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2008">2008</date>
<idno type="doi">10.1109/TPAMI.2007.70724</idno>
<idno type="RBID">pubmed:18276966</idno>
<idno type="pmid">18276966</idno>
<idno type="wicri:Area/PubMed/Corpus">000052</idno>
<idno type="wicri:Area/PubMed/Curation">000052</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000052</idno>
<idno type="wicri:Area/Ncbi/Merge">000048</idno>
<idno type="wicri:Area/Ncbi/Curation">000048</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000048</idno>
<idno type="wicri:doubleKey">0162-8828:2008:Liang J:geometric:rectification:of</idno>
<idno type="wicri:Area/Main/Merge">000B29</idno>
<idno type="wicri:Area/Main/Curation">000B17</idno>
<idno type="wicri:Area/Main/Exploration">000B17</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Geometric rectification of camera-captured document images.</title>
<author><name sortKey="Liang, Jian" sort="Liang, Jian" uniqKey="Liang J" first="Jian" last="Liang">Jian Liang</name>
<affiliation wicri:level="2"><nlm:affiliation>Amazon.com, 701 5th Avenue #614.B, Seattle, WA 98104, USA. jliang@amazon.com</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Amazon.com, 701 5th Avenue #614.B, Seattle, WA 98104</wicri:regionArea>
<placeName><region type="state">Washington (État)</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
</author>
<author><name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
</author>
</analytic>
<series><title level="j">IEEE transactions on pattern analysis and machine intelligence</title>
<idno type="ISSN">0162-8828</idno>
<imprint><date when="2008" type="published">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Artifacts</term>
<term>Artificial Intelligence</term>
<term>Automatic Data Processing (methods)</term>
<term>Documentation (methods)</term>
<term>Image Enhancement (methods)</term>
<term>Image Interpretation, Computer-Assisted (methods)</term>
<term>Imaging, Three-Dimensional (methods)</term>
<term>Pattern Recognition, Automated (methods)</term>
<term>Photography (methods)</term>
<term>Reproducibility of Results</term>
<term>Sensitivity and Specificity</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Automatic Data Processing</term>
<term>Documentation</term>
<term>Image Enhancement</term>
<term>Image Interpretation, Computer-Assisted</term>
<term>Imaging, Three-Dimensional</term>
<term>Pattern Recognition, Automated</term>
<term>Photography</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Artifacts</term>
<term>Artificial Intelligence</term>
<term>Reproducibility of Results</term>
<term>Sensitivity and Specificity</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and non-contact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by non-planar document shape and perspective projection, which lead to failure of current OCR technologies. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image. Our approach estimates 3D document shape from texture flow information obtained directly from the image without requiring additional 3D/metric data or prior camera calibration. Our framework provides a unified solution for both planar and curved documents and can be applied in many, especially mobile, camera-based document analysis applications. Experiments show that our method produces results that are significantly more OCR compatible than the original images.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Washington (État)</li>
</region>
</list>
<tree><noCountry><name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
<name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
</noCountry>
<country name="États-Unis"><region name="Washington (État)"><name sortKey="Liang, Jian" sort="Liang, Jian" uniqKey="Liang J" first="Jian" last="Liang">Jian Liang</name>
</region>
</country>
</tree>
</affiliations>
</record>
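
The record above can also be read programmatically. The following is a minimal sketch, not part of Dilib: it assumes the record has been copied into a Python string (abridged here, with affiliation details and keywords left out). The fragment uses the prefixes `wicri:` and `nlm:` without declaring them, which a namespace-aware parser such as `xml.etree.ElementTree` is expected to reject, so the sketch wraps the fragment in a hypothetical `<wrapper>` element that binds both prefixes to placeholder URIs. The names `parse_record` and `summarize` and the `urn:example:` URIs are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the record above: same element names and values,
# with affiliation details and keyword lists left out.
RECORD = """<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Geometric rectification of camera-captured document images.</title>
<author><name sortKey="Liang, Jian" first="Jian" last="Liang">Jian Liang</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Dementhon, Daniel" first="Daniel" last="Dementhon">Daniel Dementhon</name>
</author>
<author><name sortKey="Doermann, David" first="David" last="Doermann">David Doermann</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2008">2008</date>
<idno type="doi">10.1109/TPAMI.2007.70724</idno>
<idno type="pmid">18276966</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace URIs (assumption): the record does not declare them.
WRAPPER_OPEN = '<wrapper xmlns:wicri="urn:example:wicri" xmlns:nlm="urn:example:nlm">'
WRAPPER_CLOSE = "</wrapper>"


def parse_record(text):
    """Parse a record fragment whose wicri:/nlm: prefixes are undeclared."""
    root = ET.fromstring(WRAPPER_OPEN + text + WRAPPER_CLOSE)
    return root.find("record")


def summarize(record):
    """Collect title, author names, identifiers, and year from the header."""
    file_desc = record.find("TEI/teiHeader/fileDesc")
    title_stmt = file_desc.find("titleStmt")
    pub_stmt = file_desc.find("publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "ids": {idno.get("type"): idno.text for idno in pub_stmt.findall("idno")},
        "year": pub_stmt.find("date").get("when"),
    }


summary = summarize(parse_record(RECORD))
print(summary["title"])
print("; ".join(summary["authors"]))
print(summary["ids"]["doi"], summary["year"])
```

Wrapping the fragment leaves the record text itself unmodified, which matters if the same string is later written back or compared with the stored copy.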

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000B17 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000B17 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     pubmed:18276966
   |texte=   Geometric rectification of camera-captured document images.
}}
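
When many records need such a link, the template call can be produced from the record's fields. The sketch below is an illustration only: the function name `explor_link` and its parameter names are hypothetical, and it simply reproduces the layout shown above (each `name=` padded to nine characters). Whether the template depends on that spacing is not stated on this page.

```python
def explor_link(wiki, area, flux, step, key_type, key, text):
    """Render an {{Explor lien}} call laid out like the one shown above."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", step),
        ("type", key_type),
        ("clé", key),
        ("texte", text),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        # Pad "name=" to 9 characters so the values line up in one column.
        lines.append("   |" + (name + "=").ljust(9) + value)
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Ticri/CIDE",
    area="OcrV1",
    flux="Main",
    step="Exploration",
    key_type="RBID",
    key="pubmed:18276966",
    text="Geometric rectification of camera-captured document images.",
)
print(link)
```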

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:18276966" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

OCR exploration server

Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.